33. CNNs for Image Classification
Padding
Padding is just adding a border of pixels around an image. In PyTorch, you specify the size of this border.
Why do we need padding?
When we create a convolutional layer, we move a square kernel around an image, using its center pixel as an anchor; as a result, the kernel cannot be centered on pixels at the edges and corners of the image. The nice feature of padding is that it lets us control the spatial size of the output volumes (most commonly, as we'll see soon, we use it to exactly preserve the spatial size of the input volume, so the input and output width and height are the same).
The most common methods of padding are padding an image with all 0-pixels (zero padding) or padding it with the nearest pixel value. You can read more about calculating the amount of padding, given a kernel_size, here.
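As a small illustration (the tiny 3x3 input and 1-pixel border below are arbitrary choices for this sketch, not values from the lesson), the two padding methods can be compared directly in PyTorch:

```python
import torch
import torch.nn.functional as F

# a tiny 3x3 "image" so the added border is easy to see
img = torch.arange(9, dtype=torch.float32).reshape(1, 1, 3, 3)  # (batch, channels, height, width)

# zero padding: a 1-pixel border of 0s on every side -> 5x5
zero_padded = F.pad(img, pad=(1, 1, 1, 1), mode='constant', value=0)

# nearest-pixel padding ("replicate" mode in PyTorch) -> 5x5
nearest_padded = F.pad(img, pad=(1, 1, 1, 1), mode='replicate')

print(zero_padded.squeeze())
print(nearest_padded.squeeze())
```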
SOLUTION:
- `nn.MaxPool2d(2, 4)`
- `nn.MaxPool2d(4, 4)`
SOLUTION:
`padding=3`

PyTorch Layer Documentation
Convolutional Layers
We typically define a convolutional layer in PyTorch using `nn.Conv2d`, with the following parameters specified:

`nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0)`
- `in_channels` refers to the depth of an input. For a grayscale image, this depth = 1
- `out_channels` refers to the desired depth of the output, or the number of filtered images you want to get as output
- `kernel_size` is the size of your convolutional kernel (most commonly 3, for a 3x3 kernel)
- `stride` and `padding` have default values, but should be set depending on how large you want your output to be in the spatial dimensions x, y
Read more about Conv2d in the documentation.
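As a quick sanity check (the 32x32 RGB input and the specific layer values below are only illustrative), the output's spatial size follows from kernel_size, stride, and padding:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)  # (batch, channels, height, width)

# out_size = (in_size - kernel_size + 2*padding) // stride + 1
conv_same = nn.Conv2d(3, 10, kernel_size=3, stride=1, padding=1)  # (32 - 3 + 2)//1 + 1 = 32
conv_down = nn.Conv2d(3, 10, kernel_size=3, stride=2, padding=1)  # (32 - 3 + 2)//2 + 1 = 16

print(conv_same(x).shape)  # torch.Size([1, 10, 32, 32])
print(conv_down(x).shape)  # torch.Size([1, 10, 16, 16])
```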
Pooling Layers
Maxpooling layers commonly come after convolutional layers to shrink the x-y dimensions of an input. Read more about pooling layers in PyTorch here.
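A minimal sketch (the input size is just an example) of a maxpooling layer with a 2x2 window and stride 2, which halves the x-y dimensions:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 16, 28, 28)  # e.g. the output of a previous convolutional layer

# kernel_size=2, stride=2: each non-overlapping 2x2 window is reduced to its maximum value
pool = nn.MaxPool2d(kernel_size=2, stride=2)
print(pool(x).shape)  # torch.Size([1, 16, 14, 14])
```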